Disambiguation method of multi-feature fusion based on HowNet sememe and Word2vec word embedding representation

WANG Wei, ZHAO Erping, CUI Zhiyuan, SUN Hao

Journal of Computer Applications 2021, 41 (8): 2193-2198. DOI: 10.11772/j.issn.1001-9081.2020101625

Existing word vectors represent low-frequency words poorly, the semantic information they express is easily confused, and existing disambiguation models cannot distinguish polysemous words accurately. To address these problems, a multi-feature fusion disambiguation method based on word vector fusion was proposed. In the method, word vectors expressed by HowNet sememes were fused with word vectors generated by Word2vec (Word to vector) to supplement the polysemous information of words and improve the representation quality of low-frequency words. Firstly, the cosine similarity between the entity to be disambiguated and each candidate entity was calculated to obtain the similarity between them. After that, a clustering algorithm and the HowNet knowledge base were used to obtain the entity category feature similarity. Then, an improved Latent Dirichlet Allocation (LDA) topic model was used to extract topic keywords, from which the entity topic feature similarity was calculated. Finally, word sense disambiguation of polysemous words was realized by weighted fusion of the above three types of feature similarity. Experiments on a test set from the Tibet animal husbandry field show that the accuracy of the proposed method (90.1%) is 7.6 percentage points higher than that of a typical graph-model disambiguation method.
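
The final two steps of the abstract (cosine similarity between vectors, then weighted fusion of three feature similarities) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example weights (0.4, 0.3, 0.3), and the rule of picking the candidate with the highest fused score are assumptions made for illustration, since the abstract does not state the weights or the selection rule.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse_similarities(entity_sim, category_sim, topic_sim, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three feature similarities.

    The default weights are hypothetical; the abstract does not give values.
    """
    w_entity, w_category, w_topic = weights
    return w_entity * entity_sim + w_category * category_sim + w_topic * topic_sim


def disambiguate(candidates, weights=(0.4, 0.3, 0.3)):
    """Return the candidate name with the highest fused similarity.

    `candidates` maps a candidate name to a tuple
    (entity_sim, category_sim, topic_sim). Selecting the maximum is an
    assumed decision rule.
    """
    return max(
        candidates,
        key=lambda name: fuse_similarities(*candidates[name], weights=weights),
    )
```

For example, `disambiguate({"a": (0.9, 0.2, 0.1), "b": (0.5, 0.8, 0.9)})` prefers `"b"` under the assumed weights, because strong category and topic agreement outweighs a higher raw vector similarity.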